Introduction (Tamer)

The Opioid Crisis is truly that - a crisis. Over the past 20 years,

Data

Prescription Rate Data (Tamer)

Overdose Rate Data (Chris)

Analysis

Prescription Rate vs. Overdose Rate (Chris)

  1. Show maps
  2. Ask Questions about whether one causes the other, etc.

Regression

# scatter / regression between prescription rate and overdose rate
mod <- lm(age_adjusted_rate ~ prescription_rate, data = full_data)
msummary(mod)
##                   Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        7.56323    2.96818   2.548  0.01409 * 
## prescription_rate  0.10475    0.03524   2.972  0.00461 **
## 
## Residual standard error: 5.298 on 48 degrees of freedom
## Multiple R-squared:  0.1554, Adjusted R-squared:  0.1379 
## F-statistic: 8.835 on 1 and 48 DF,  p-value: 0.004609
ggplot(data = full_data, aes(x = prescription_rate)) +
  geom_histogram(bins = 15)

ggplot(data = full_data, aes(x = prescription_rate, y = age_adjusted_rate)) +
  geom_point() + 
  geom_abline(intercept = 7.56323, slope = 0.10475)

Clustering (Sean)

Text explaining why k-means

Determining the Optimal K

Explain silhouette score

silhouette_score <- function(k){
  km <- kmeans(full_data[, 2:3], centers = k, nstart = 20)
  score <- cluster::silhouette(km$cluster, dist(full_data[, 2:3]))
  mean(score[, 3])
}

k <- 2:5
avg_sil <- sapply(k, silhouette_score)
optimal_k <- which(as.data.frame(avg_sil)$avg_sil == max(avg_sil)) + 1
optimal_k
## [1] 2
km <- kmeans(full_data[, 2:3], centers = optimal_k, nstart = 20)
full_data <- mutate(full_data, cluster = as.character(km$cluster))

Clustering for 2014

Conclusion (Chris, Sean)